Nonparametric Learning in High Dimensions

نویسندگان

  • Han Liu
  • Zenghai Liu
  • Zhen Han
  • Larry Wasserman
چکیده

This thesis develops flexible and principled nonparametric learning algorithms to explore, understand, and predict high dimensional and complex datasets. Such data appear frequently in modern scientific domains and lead to numerous important applications. For example, exploring high dimensional functional magnetic resonance imaging data helps us to better understand brain functionalities; inferring large-scale gene regulatory network is crucial for new drug design and development; detecting anomalies in high dimensional transaction databases is vital for corporate and government security. Our main results include a rigorous theoretical framework and efficient nonparametric learning algorithms that exploit hidden structures to overcome the curse of dimensionality when analyzing massive high dimensional datasets. These algorithms have strong theoretical guarantees and provide high dimensional nonparametric recipes for many important learning tasks, ranging from unsupervised exploratory data analysis to supervised predictive modeling. In this thesis, we address three aspects: 1 Understanding the statistical theories of high dimensional nonparametric inference, including risk, estimation, and model selection consistency; 2 Designing new methods for different data-analysis tasks, including regression, classification, density estimation, graphical model learning, multi-task learning, spatial-temporal adaptive learning; 3 Demonstrating the usefulness of these methods in scientific applications, including functional genomics, cognitive neuroscience, and meteorology. In the last part of this thesis, we also present the future vision of high dimensional and large-scale nonparametric inference.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Bayesian Nonparametric Models

A Bayesian nonparametric model is a Bayesian model on an infinite-dimensional parameter space. The parameter space is typically chosen as the set of all possible solutions for a given learning problem. For example, in a regression problem the parameter space can be the set of continuous functions, and in a density estimation problem the space can consist of all densities. A Bayesian nonparametr...

متن کامل

Bayesian Nonparametric Regression with Local Models

We propose a Bayesian nonparametric regression algorithm with locally linear models for high-dimensional, data-rich scenarios where real-time, incremental learning is necessary. Nonlinear function approximation with high-dimensional input data is a nontrivial problem. For example, real-time learning of internal models for compliant control may be needed in a highdimensional movement system like...

متن کامل

An Analysis of Scrap Learning Dimensions in Technical & Vocational Training (Work& Knowledge Branch) Senior High Schools

 Introduction: Undesirable teachings or curriculum prepare the ground for training unqualified people Among these, improving available curriculum are very significant. The present study aims to analyze scrap learning dimensions in technical and vocational training (work & knowledge branch) senior high schools. The study uses a qualitative approach.  Methodology: The present study was qualita...

متن کامل

Dimensions and Indices of Leading Transfer of Learning: A Phenomenological Study

In order to identify the dimensions and indices of the phenomenon of “leading transfer of learning” within the secondary schools of Tehran province, a sample of 17 lecturers, principals, and experienced teachers was targeted and then interviewed. Inductive analyses of the collected data at three levels of open, pivotal, and selective coding indicate that the role of high school principals along...

متن کامل

Nonparametric Bayesian Models for Unsupervised Learning

NONPARAMETRIC BAYESIAN MODELS FOR UNSUPERVISED LEARNING Pu Wang, PhD George Mason University, 2011 Dissertation Director: Carlotta Domeniconi Unsupervised learning is an important topic in machine learning. In particular, clustering is an unsupervised learning problem that arises in a variety of applications for data analysis and mining. Unfortunately, clustering is an ill-posed problem and, as...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010